Search | VHL Regional Portal

1.

Investigating Protein Structure and Evolution with SCOP2.

Andreeva, Antonina; Howorth, Dave; Chothia, Cyrus; Kulesha, Eugene; Murzin, Alexey G.

Curr Protoc Bioinformatics ; 49: 1.26.1-1.26.21, 2015 Mar 09.

Article in English | MEDLINE | ID: mdl-25754991

ABSTRACT

SCOP2 is a successor to the Structural Classification of Proteins (SCOP) database that organizes proteins of known structure according to their structural and evolutionary relationships. It was designed to provide a more advanced framework for the classification of proteins. The SCOP2 classification is described in terms of a directed acyclic graph in which each node defines a relationship of particular type that is represented by a region of protein structure and sequence. The SCOP2 data are accessible via SCOP2-Browser and SCOP2-Graph. This protocol unit describes different ways to explore and investigate the SCOP2 evolutionary and structural groupings.

Subjects

Databases, Protein; Evolution, Molecular; Proteins/chemistry; Amino Acid Sequence; Internet; Protein Structure, Secondary; Protein Structure, Tertiary

2.

Genome3D: exploiting structure to help users understand their sequences.

Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cozzetto, Domenico; Dana, José M; Filippis, Ioannis; Gough, Julian; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mistry, Jaina; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Oates, Matt E; Punta, Marco; Rackham, Owen J L; Stahlhacke, Jonathan; Sternberg, Michael J E; Velankar, Sameer; Orengo, Christine.

Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.

Subjects

Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Algorithms; Genomics; Internet; Models, Molecular; Protein Structure, Tertiary/genetics; Sequence Analysis, Protein

3.

SCOP2 prototype: a new approach to protein structure mining.

Andreeva, Antonina; Howorth, Dave; Chothia, Cyrus; Kulesha, Eugene; Murzin, Alexey G.

Nucleic Acids Res ; 42(Database issue): D310-4, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24293656

ABSTRACT

We present a prototype of a new structural classification of proteins, SCOP2 (http://scop2.mrc-lmb.cam.ac.uk/), that we have developed recently. SCOP2 is a successor to the Structural Classification of Proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. Similarly to SCOP, the main focus of SCOP2 is to organize structurally characterized proteins according to their structural and evolutionary relationships. SCOP2 was designed to provide a more advanced framework for protein structure annotation and classification. It defines a new approach to the classification of proteins that is essentially different from SCOP, but retains its best features. The SCOP2 classification is described in terms of a directed acyclic graph in which nodes form a complex network of many-to-many relationships and are represented by a region of protein structure and sequence. The new classification project is expected to ensure new advances in the field and open new areas of research.

Subjects

Databases, Protein; Protein Structure, Tertiary; Data Mining; Internet; Proteins/classification

4.

Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine.

Nucleic Acids Res ; 41(Database issue): D499-507, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23203986

ABSTRACT

Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).

Subjects

Databases, Protein; Protein Structure, Tertiary; Genomics; Humans; Internet; Molecular Sequence Annotation; Proteins/chemistry; Proteins/classification; Proteins/genetics; Software

5.

Evolution of oligomeric state through geometric coupling of protein interfaces.

Perica, Tina; Chothia, Cyrus; Teichmann, Sarah A.

Proc Natl Acad Sci U S A ; 109(21): 8127-32, 2012 May 22.

Article in English | MEDLINE | ID: mdl-22566652

ABSTRACT

Oligomerization plays an important role in the function of many proteins. Thus, understanding, predicting, and, ultimately, engineering oligomerization presents a long-standing interest. From the perspective of structural biology, protein-protein interactions have mainly been analyzed in terms of the biophysical nature and evolution of protein interfaces. Here, our aim is to quantify the importance of the larger structural context of protein interfaces in protein interaction evolution. Specifically, we ask to what extent intersubunit geometry affects oligomerization state. We define a set of structural parameters describing the overall geometry and relative positions of interfaces of homomeric complexes with different oligomeric states. This allows us to quantify the contribution of direct sequence changes in interfaces versus indirect changes outside the interface that affect intersubunit geometry. We find that such indirect, or allosteric, mutations, which affect intersubunit geometry via indirect mechanisms, are as important as interface sequence changes for the evolution of oligomeric states.

Subjects

Bacteria/genetics; Bacterial Proteins/genetics; Evolution, Molecular; Interleukin-8/genetics; Multigene Family/physiology; Pentosyltransferases/genetics; Repressor Proteins/genetics; Triose-Phosphate Isomerase/genetics; Amino Acid Sequence; Bacillus subtilis/genetics; Bacterial Proteins/chemistry; Conserved Sequence; Dimerization; Interleukin-8/chemistry; Models, Chemical; Molecular Sequence Data; Multiprotein Complexes/chemistry; Multiprotein Complexes/genetics; Mycobacterium tuberculosis/genetics; Pentosyltransferases/chemistry; Phylogeny; Protein Structure, Quaternary; Repressor Proteins/chemistry; Thermotoga maritima/genetics; Triose-Phosphate Isomerase/chemistry

6.

SUPERFAMILY 1.75 including a domain-centric gene ontology method.

de Lima Morais, David A; Fang, Hai; Rackham, Owen J L; Wilson, Derek; Pethica, Ralph; Chothia, Cyrus; Gough, Julian.

Nucleic Acids Res ; 39(Database issue): D427-34, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21062816

ABSTRACT

The SUPERFAMILY resource provides protein domain assignments at the Structural Classification of Proteins (SCOP) superfamily level for over 1400 completely sequenced genomes, over 120 metagenomes and other gene collections such as UniProt. All models and assignments are available to browse and download at http://supfam.org. A new hidden Markov model library based on SCOP 1.75 has been created and a previously ignored class of SCOP, coiled coils, is now included. Our scoring component now uses HMMER3, which is orders of magnitude faster and produces superior results. A cloud-based pipeline was implemented and is publicly available on the Amazon Web Services Elastic Compute Cloud. The SUPERFAMILY reference tree of life has been improved, allowing the user to highlight a chosen superfamily, family or domain architecture on the tree of life. The most significant advance in SUPERFAMILY is that it now contains a domain-based gene ontology (GO) at the superfamily and family levels. A new methodology was developed to ensure a high quality GO annotation. The new methodology is general purpose and has been used to produce domain-based phenotypic ontologies in addition to GO.

Subjects

Databases, Protein; Protein Structure, Tertiary; Proteins/classification; Genes; Phenotype; Phylogeny; Proteins/chemistry; Proteins/genetics; Sequence Analysis, Protein; Software

7.

Ubiquitin--molecular mechanisms for recognition of different structures.

Perica, Tina; Chothia, Cyrus.

Curr Opin Struct Biol ; 20(3): 367-76, 2010 Jun.

Article in English | MEDLINE | ID: mdl-20456943

ABSTRACT

The role of ubiquitin in many of the known cellular processes, not just protein degradation, is based on its unique ability to bind a range of proteins that are structurally and functionally different. To understand how ubiquitin can bind to proteins with different structures, we review the extent of the conservation and variation that occur in the structures of two free ubiquitins and ubiquitins in 16 complexes that have been determined at high resolution (1.2-2 Å). Around 80% of the atomic groups in these structures have positions that differ less than 1 Å. This conserved core provides a rigid platform for flexible loop regions, 39 residues with side chains that can take up different conformations, and a flexible six-residue region at the C-terminus. In most cases the ability of ubiquitin to bind different structures is limited in part by a central set of residues that largely conserve their conformations. The accommodation of differences in binding proteins is enabled by changes in the flexible surface side chains, loop movements, different specific interactions, water molecules in the interface and the flexible C-terminus.

Subjects

Ubiquitin/metabolism; Binding Sites; Conserved Sequence; Humans; Protein Binding; Protein Conformation; Substrate Specificity; Ubiquitin/chemistry

8.

Genomic and structural aspects of protein evolution.

Chothia, Cyrus; Gough, Julian.

Biochem J ; 419(1): 15-28, 2009 Apr 01.

Article in English | MEDLINE | ID: mdl-19272021

ABSTRACT

It has been known for more than 35 years that, during evolution, new proteins are formed by gene duplications, sequence and structural divergence and, in many cases, gene combinations. The genome projects have produced complete, or almost complete, descriptions of the protein repertoires of over 600 distinct organisms. Analyses of these data have dramatically increased our understanding of the formation of new proteins. At the present time, we can accurately trace the evolutionary relationships of about half the proteins found in most genomes, and it is these proteins that we discuss in the present review. Usually, the units of evolution are protein domains that are duplicated, diverge and form combinations. Small proteins contain one domain, and large proteins contain combinations of two or more domains. Domains descended from a common ancestor are clustered into superfamilies. In most genomes, the net growth of superfamily members means that more than 90% of domains are duplicates. In a section on domain duplications, we discuss the number of currently known superfamilies, their size and distribution, and superfamily expansions related to biological complexity and to specific lineages. In a section on divergence, we describe how sequences and structures diverge, the changes in stability produced by acceptable mutations, and the nature of functional divergence and selection. In a section on domain combinations, we discuss their general nature, the sequential order of domains, how combinations modify function, and the extraordinary variety of the domain combinations found in different genomes. We conclude with a brief note on other forms of protein evolution and speculations on the origins of the duplication, divergence and combination processes.

Subjects

Evolution, Molecular; Genome/genetics; Proteins/genetics; Animals; Humans; Mutation; Proteins/chemistry

9.

SUPERFAMILY--sophisticated comparative genomics, data mining, visualization and phylogeny.

Wilson, Derek; Pethica, Ralph; Zhou, Yiduo; Talbot, Charles; Vogel, Christine; Madera, Martin; Chothia, Cyrus; Gough, Julian.

Nucleic Acids Res ; 37(Database issue): D380-6, 2009 Jan.

Article in English | MEDLINE | ID: mdl-19036790

ABSTRACT

SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.

Subjects

Databases, Protein; Protein Structure, Tertiary; Proteins/genetics; Animals; Computer Graphics; Genomics; Humans; Phylogeny; Protein Structure, Tertiary/genetics; Proteins/classification; Sequence Analysis, DNA; Sequence Analysis, Protein

10.

Data growth and its impact on the SCOP database: new developments.

Andreeva, Antonina; Howorth, Dave; Chandonia, John-Marc; Brenner, Steven E; Hubbard, Tim J P; Chothia, Cyrus; Murzin, Alexey G.

Nucleic Acids Res ; 36(Database issue): D419-25, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18000004

ABSTRACT

The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. The SCOP hierarchy comprises the following levels: Species, Protein, Family, Superfamily, Fold and Class. While keeping the original classification scheme intact, we have changed the production of SCOP in order to cope with a rapid growth of new structural data and to facilitate the discovery of new protein relationships. We describe ongoing developments and new features implemented in SCOP. A new update protocol supports batch classification of new protein structures by their detected relationships at Family and Superfamily levels in contrast to our previous sequential handling of new structural data by release date. We introduce pre-SCOP, a preview of the SCOP developmental version that enables earlier access to the information on new relationships. We also discuss the impact of worldwide Structural Genomics initiatives, which are producing new protein structures at an increasing rate, on the rates of discovery and growth of protein families and superfamilies. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.

Subjects

Databases, Protein; Protein Structure, Tertiary; Proteins/classification; Databases, Protein/trends; Evolution, Molecular; Genomics; Internet; Proteins/genetics

11.

The selection of acceptable protein mutations.

Sasidharan, Rajkumar; Chothia, Cyrus.

Proc Natl Acad Sci U S A ; 104(24): 10080-5, 2007 Jun 12.

Article in English | MEDLINE | ID: mdl-17540730

ABSTRACT

We have determined the general constraints that govern sequence divergence in proteins that retain entirely, or very largely, the same structure and function. To do this we collected data from three different groups of orthologous sequences: those found in humans and mice, in humans and chickens, and in Escherichia coli and Salmonella enterica. In total, these organisms have 21,738 suitable pairs of orthologs, and these contain nearly 2 million mutations. The three groups differ greatly in the taxa from which they come and/or in the time that separates them from their last common ancestor. Nevertheless, the results we obtain from the three different groups are strikingly similar. For each group, the orthologous sequence pairs were assigned to six different divergence categories on the basis of their sequence identities. For categories with the same divergence, common accepted mutations have similar frequencies and rank orders in the three groups. With divergence, the width of the range of common mutations grows in the same manner in each group. We examined the distribution of mutations in protein structures. With increasing divergence, mutations increase at different rates in the buried, intermediate, and exposed regions of protein structures in a manner that explains the exponential relationship between the divergence of structure and sequence. This work implies that commonly allowed mutations are selected by a set of general constraints that are well defined and whose nature varies with divergence.

Subjects

Mutation; Proteins/chemistry; Proteins/genetics; Selection, Genetic; Animals; Chickens; Escherichia coli Proteins/chemistry; Escherichia coli Proteins/genetics; Evolution, Molecular; Gene Frequency; Genetic Variation; Humans; Mice; Salmonella/chemistry; Salmonella/classification; Salmonella/genetics; Sequence Analysis, Protein

12.

The generation of new protein functions by the combination of domains.

Bashton, Matthew; Chothia, Cyrus.

Structure ; 15(1): 85-99, 2007 Jan.

Article in English | MEDLINE | ID: mdl-17223535

ABSTRACT

During evolution, many new proteins have been formed by the process of gene duplication and combination. The genes involved in this process usually code for whole domains. Small proteins contain one domain; medium and large proteins contain two or more domains. We have compared homologous domains that occur in both one-domain proteins and multidomain proteins. We have determined (1) how the functions of the individual domains in the multidomain proteins combine to produce their overall functions and (2) the extent to which these functions are similar to those in the one-domain homologs. We describe how domain combinations increase the specificity of enzymes; act as links between domains that have functional roles; regulate activity; combine within one chain functions that can act either independently, in concert or in new contexts; and provide the structural framework for the evolution of entirely new functions.

Subjects

Enzymes/chemistry; Enzymes/genetics; Evolution, Molecular; Gene Fusion; Protein Structure, Tertiary; Catalysis; Conserved Sequence; Databases, Protein; Enzymes/metabolism; Recombination, Genetic; Sequence Homology, Amino Acid

13.

The SUPERFAMILY database in 2007: families and functions.

Wilson, Derek; Madera, Martin; Vogel, Christine; Chothia, Cyrus; Gough, Julian.

Nucleic Acids Res ; 35(Database issue): D308-13, 2007 Jan.

Article in English | MEDLINE | ID: mdl-17098927

ABSTRACT

The SUPERFAMILY database provides protein domain assignments, at the SCOP 'superfamily' level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains of different families which have a common evolutionary ancestor based on structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from http://supfam.org. The web interface includes services such as domain architectures and alignment details for all protein assignments, searchable domain combinations, domain occurrence network visualization, detection of over- or under-represented superfamilies for a given genome by comparison with other genomes, assignment of manually submitted sequences and keyword searches. In this update we describe the SUPERFAMILY database and outline two major developments: (i) incorporation of family level assignments and (ii) a superfamily-level functional annotation. The SUPERFAMILY database can be used for general protein evolution and superfamily-specific studies, genomic annotation, and structural genomics target suggestion and assessment.

Subjects

Databases, Protein; Protein Structure, Tertiary; Genomics; Internet; Protein Structure, Tertiary/genetics; Protein Structure, Tertiary/physiology; Proteins/classification; User-Computer Interface

14.

3D complex: a structural classification of protein complexes.

Levy, Emmanuel D; Pereira-Leal, Jose B; Chothia, Cyrus; Teichmann, Sarah A.

PLoS Comput Biol ; 2(11): e155, 2006 Nov 17.

Article in English | MEDLINE | ID: mdl-17112313

ABSTRACT

Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

Subjects

Databases, Protein; Models, Chemical; Models, Molecular; Protein Interaction Mapping/methods; Proteins/chemistry; Proteins/ultrastructure; Sequence Analysis, Protein/methods; Amino Acid Sequence; Binding Sites; Computer Simulation; Molecular Sequence Data; Multiprotein Complexes/chemistry; Multiprotein Complexes/ultrastructure; Protein Binding; Protein Conformation; Proteins/classification

15.

Divergence of interdomain geometry in two-domain proteins.

Han, Jung-Hoon; Kerrison, Nicola; Chothia, Cyrus; Teichmann, Sarah A.

Structure ; 14(5): 935-45, 2006 May.

Article in English | MEDLINE | ID: mdl-16698554

ABSTRACT

For homologous protein chains composed of two domains, we have determined the extent to which they conserve (1) their interdomain geometry and (2) the molecular structure of the domain interface. This work was carried out on 128 unique two-domain architectures. Of the 128, we find 75 conserve their interdomain geometry and the structure of their domain interface; 5 conserve their interdomain geometry but not the structure of their interface; and 48 have variable geometries and divergent interface structure. We describe how different types of interface changes or the absence of an interface is responsible for these differences in geometry. Variable interdomain geometries can be found in homologous structures with high sequence identities (70%).

Subjects

Protein Structure, Tertiary; Proteins/chemistry; Structural Homology, Protein; Computational Biology; Sequence Homology, Amino Acid

16.

Protein family expansions and biological complexity.

Vogel, Christine; Chothia, Cyrus.

PLoS Comput Biol ; 2(5): e48, 2006 May.

Article in English | MEDLINE | ID: mdl-16733546

ABSTRACT

During the course of evolution, new proteins are produced very largely as the result of gene duplication, divergence and, in many cases, combination. This means that proteins or protein domains belong to families or, in cases where their relationships can only be recognised on the basis of structure, superfamilies whose members descended from a common ancestor. The size of superfamilies can vary greatly. Also, during the course of evolution organisms of increasing complexity have arisen. In this paper we determine the identity of those superfamilies whose relative sizes in different organisms are highly correlated to the complexity of the organisms. As a measure of the complexity of 38 uni- and multicellular eukaryotes we took the number of different cell types of which they are composed. Of 1,219 superfamilies, there are 194 whose sizes in the 38 organisms are strongly correlated with the number of cell types in the organisms. We give outline descriptions of these superfamilies. Half are involved in extracellular processes or regulation and smaller proportions in other types of activity. Half of all superfamilies have no significant correlation with complexity. We also determined whether the expansions of large superfamilies correlate with each other. We found three large clusters of correlated expansions: one involves expansions in both vertebrates and plants, one just in vertebrates, and one just in plants. Our work identifies important protein families and provides one explanation of the discrepancy between the total number of genes and the apparent physiological complexity of eukaryotic organisms.

Subjects

Biological Evolution; Computational Biology/methods; Animals; Cluster Analysis; Databases, Factual; Evolution, Molecular; Genes, Fungal; Genes, Plant; Humans; Models, Biological; Software

17.

VH gene segments in the mouse and human genomes.

de Bono, Bernard; Madera, Martin; Chothia, Cyrus.

J Mol Biol ; 342(1): 131-43, 2004 Sep 03.

Article in English | MEDLINE | ID: mdl-15313612

ABSTRACT

We have examined the mouse genome sequence to determine its VH gene segment repertoire. In all, 141 segments are mapped to a 3 Mb region of chromosome 12. There is evidence that 92 of these are functional in the mouse strain used for the genome sequence, C57BL/6J; 12 are functional in other mouse strains, and 37 are pseudogenes. The mouse VH gene segment repertoire is therefore twice the size of that in humans. The mouse and human loci bear no large-scale similarity to each other. The 104 functional segments belong to one of the 15 known sequence subgroups, which have been further clustered into eight sets here. Seven of these sets, comprising 101 sequences, are related to five of the human VH families and have the same canonical structures in their hypervariable regions. Duplication of members of one set in the distal half of the locus is mainly responsible for the larger size of the mouse repertoire. Phylogenetic analysis of the VH segments indicates that most of the sequences in the human and mouse VH loci have arisen subsequent to the divergence of the two organisms from their common ancestor.

Subjects

Genes, Immunoglobulin; Genome; Immunoglobulin Heavy Chains/genetics; Immunoglobulin Variable Region/genetics; Amino Acid Sequence; Animals; Evolution, Molecular; Genetic Variation; Humans; Immunoglobulin Heavy Chains/classification; Immunoglobulin Variable Region/classification; Likelihood Functions; Mice; Molecular Sequence Data; Multigene Family; Phylogeny; Sequence Alignment

18.

The linked conservation of structure and function in a family of high diversity: the monomeric cupredoxins.

Gough, Julian; Chothia, Cyrus.

Structure ; 12(6): 917-25, 2004 Jun.

Article in English | MEDLINE | ID: mdl-15274913

ABSTRACT

The monomeric cupredoxins are a highly divergent family of copper binding electron transport proteins that function in photosynthesis and respiration. To determine how function and structure are conserved in the context of large sequence differences, we have carried out a detailed analysis of the cupredoxins of known structure and their sequence homologs. The common structure of the cupredoxins is formed by a sandwich of two beta sheets which support a copper binding site. The structure of the deeply buried core is intimately coupled to the binding site on the surface of the protein; in each protein the conserved regions form one continuous substructure that extends from the surface active site and through the center of the molecule. Residues around the active site are conserved for functional reasons, while those deeper in the structure will be conserved for structural reasons. Together the two sets support each other.

Subjects

Azurin/analogs & derivatives; Azurin/chemistry; Binding Sites; Computer Simulation; Electron Transport; Hydrogen Bonding; Models, Molecular; Oxygen Consumption; Photosynthesis; Plastocyanin/chemistry; Protein Binding; Protein Conformation; Protein Structure, Secondary; Protein Structure, Tertiary; Structure-Activity Relationship

19.

Structure, function and evolution of multidomain proteins.

Vogel, Christine; Bashton, Matthew; Kerrison, Nicola D; Chothia, Cyrus; Teichmann, Sarah A.

Curr Opin Struct Biol ; 14(2): 208-16, 2004 Apr.

Article in English | MEDLINE | ID: mdl-15093836

ABSTRACT

Proteins are composed of evolutionary units called domains; the majority of proteins consist of at least two domains. These domains and the nature of their interactions determine the function of the protein. The roles that combinations of domains play in the formation of the protein repertoire have been found by analysis of domain assignments to genome sequences. Additional findings on the geometry of domains have been gained from examination of three-dimensional protein structures. Future work will require a domain-centric functional classification scheme and efforts to determine structures of domain combinations.

Subjects

Computer Simulation; Evolution, Molecular; Protein Structure, Tertiary; Software

20.

SCOP database in 2004: refinements integrate structure and sequence family data.

Andreeva, Antonina; Howorth, Dave; Brenner, Steven E; Hubbard, Tim J P; Chothia, Cyrus; Murzin, Alexey G.

Nucleic Acids Res ; 32(Database issue): D226-9, 2004 Jan 01.

Article in English | MEDLINE | ID: mdl-14681400

ABSTRACT

The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. Protein domains in SCOP are hierarchically classified into families, superfamilies, folds and classes. The continual accumulation of sequence and structural data allows more rigorous analysis and provides important information for understanding the protein world and its evolutionary repertoire. SCOP participates in a project that aims to rationalize and integrate the data on proteins held in several sequence and structure databases. As part of this project, starting with release 1.63, we have initiated a refinement of the SCOP classification, which introduces a number of changes mostly at the levels below superfamily. The pending SCOP reclassification will be carried out gradually through a number of future releases. In addition to the expanded set of static links to external resources, available at the level of domain entries, we have started modernization of the interface capabilities of SCOP allowing more dynamic links with other databases. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.

Subjects

Databases, Protein; Proteins/chemistry; Proteins/classification; Animals; Antibodies/chemistry; Antibodies/classification; Capsid Proteins/chemistry; Capsid Proteins/classification; Computational Biology; Humans; Internet; Protein Kinases/chemistry; Protein Kinases/classification; Protein Structure, Secondary; Protein Structure, Tertiary
